Instance Smoothed Contrastive Learning for Unsupervised Sentence Embedding

Authors

Abstract

Contrastive learning-based methods, such as unsup-SimCSE, have achieved state-of-the-art (SOTA) performance in learning unsupervised sentence embeddings. However, in previous studies, each embedding used for contrastive learning is derived from only one sentence instance, and we call these embeddings instance-level embeddings. In other words, each embedding is regarded as a unique class of its own, which may hurt the generalization performance. In this study, we propose IS-CSE (instance smoothing contrastive sentence embedding) to smooth the boundaries of embeddings in the feature space. Specifically, we retrieve embeddings from a dynamic memory buffer according to semantic similarity to get a positive embedding group. Then the embeddings in the group are aggregated by a self-attention operation to produce a smoothed instance embedding for further analysis. We evaluate our method on standard semantic text similarity (STS) tasks and achieve an average of 78.30%, 79.47%, 77.73%, and 79.42% Spearman’s correlation on the base of BERT-base, BERT-large, RoBERTa-base, and RoBERTa-large respectively, a 2.05%, 1.06%, 1.16%, and 0.52% improvement compared to unsup-SimCSE.
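The retrieve-then-aggregate step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the plain-list memory buffer, the group size `k`, the `temperature` parameter, and the use of cosine-scored softmax attention with a single query (instead of a learned self-attention layer) are all assumptions made for illustration. It also assumes every vector is non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve_group(query, buffer, k):
    """Return the k buffer embeddings most similar to the query (hypothetical helper)."""
    ranked = sorted(buffer, key=lambda e: cosine(query, e), reverse=True)
    return ranked[:k]


def smooth(query, buffer, k=2, temperature=1.0):
    """Aggregate the query and its retrieved neighbours into one smoothed embedding.

    Assumption: attention weights are a softmax over cosine scores against the
    query; the paper's self-attention operation may differ in its details.
    """
    group = [query] + retrieve_group(query, buffer, k)
    scores = [cosine(query, e) / temperature for e in group]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(query)
    # Weighted average of the group members, one dimension at a time.
    return [sum(w * e[i] for w, e in zip(weights, group)) for i in range(dim)]


# Toy 2-dimensional example with a three-entry buffer.
query = [1.0, 0.0]
buffer = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
smoothed = smooth(query, buffer, k=1)
```

In this toy case the smoothed vector lies between the query and its nearest buffer entry, which is the intuition behind smoothing instance-level boundaries: the positive target is no longer a single isolated point.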


Similar resources

Learning Contrastive Connectives in Sentence Realization Ranking

We look at the average frequency of contrastive connectives in the SPaRKy Restaurant Corpus with respect to realization ratings by human judges. We implement a discriminative n-gram ranker to model these ratings and analyze the resulting n-gram weights to determine if our ranker learns this distribution. Surprisingly, our ranker learns to avoid contrastive connectives. We look at possible expla...

Supervised and Unsupervised Learning for Sentence Compression

In Statistics-Based Summarization Step One: Sentence Compression, Knight and Marcu (Knight and Marcu, 2000) (K&M) present a noisy-channel model for sentence compression. The main difficulty in using this method is the lack of data; Knight and Marcu use a corpus of 1035 training sentences. More data is not easily available, so in addition to improving the original K&M noisy-channel model, we cre...

Instance Embedding Transfer to Unsupervised Video Object Segmentation

We propose a method for unsupervised video object segmentation by transferring the knowledge encapsulated in image-based instance embedding networks. The instance embedding network produces an embedding vector for each pixel that enables identifying all pixels belonging to the same object. Though trained on static images, the instance embeddings are stable over consecutive video frames, which a...

MILEAGE: Multiple Instance LEArning with Global Embedding

Multiple Instance Learning (MIL) generally represents each example as a collection of instances such that the features for local objects can be better captured, whereas traditional learning methods typically extract a global feature vector for each example as an integral part. However, there is limited research work on investigating which of the two learning scenarios performs better. This pape...

Smoothed Dual Embedding Control

We revisit the Bellman optimality equation with Nesterov’s smoothing technique and provide a unique saddle-point optimization perspective of the policy optimization problem in reinforcement learning based on Fenchel duality. A new reinforcement learning algorithm, called Smoothed Dual Embedding Control or SDEC, is derived to solve the saddle-point reformulation with arbitrary learnable function...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i11.26512